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Abstract 

Let po be an invariant probability density of a deterministic dynamical system / and pt the invariant probability 
density of a random perturbation of / by additive noise of amplitude e. Suppose po is stochastically stable in the 
sense that p^ — > po as e — > 0. Through a systematic numerical study of concrete examples, I show that: 

1. The rate of convergence of p<. to po as e ^ is frequently governed by power laws: |[pe — po[|i ~ e'' for 



►^ ■ some 7 > 0. 

'— ' ' 2. When the deterministic system / exhibits exponential decay of correlations, a simple heuristic can correctly 

fvj ■ predict the exponent 7 based on the structure of po. 

O . 

^^ , of correlations. For these examples, the convergence of p^ to po as e — ^ continues to be governed by power 

("^ ' laws but the heuristic provides only an upper bound on the power law exponent 7. 

i-Q ' Furthermore, this numerical study requires the computation of jlp^ — po||i for 1.5 — 2.5 decades of e and provides 

C^ ' an opportunity to discuss and compare standard numerical methods for computing invariant probability densities in 

ri , some depth. 

> 

X ' 1 Introduction 

Deterministic chaotic systems generally possess large numbers of invariant probability measures. Physical consid- 
erations lead naturally to the study of invariant measures which are stable under small random perturbations. (See 
§1.1.) This paper explores the degree of stability of such stochastically stable invariant measures in the setting of 
discrete-time systems with invariant probability densities. 
More precisely, consider dynamical systems of the form 

Xk+l = f(xk) (1) 

defined by a map f : M O, for example the map x 1-^ 2x + a sin(27ra;) (mod 1) shown in Figure 1. In this paper, the 
space M wil always be the circle S^, an interval in R, or a product of such sets. Thus operations like addition make 

1 



sense in M and there is always a natural reference measure on M, namely the normalized Lebesgue measure. Random 
perturbations of (1) can therefore be defined via Markov chains of the form 

Xk+i = I{xk) +e^fc, (2) 

where the {^f.) are independent, identically-distributed random variables with a common probability density indepen- 
dent of Xfc. If / is expanding {i.e. if the singular values of Df{x) are all > 1) for a sufficiently large set of x G M and 
M is connected, then the map / possesses a unique invariant probability density po (see [1,2] for precise statements 
of known results). An invariant density po is stochastically stable if, for every continuous observable (/>, 

lim 

^^^JM JM 

where /^^ denotes integration with respect to Lebesgue measure on M. When po is stochastically stable, it is natural 
to try to predict the rate with which p^ converges to po as e ^ 0. As the examples in this paper demonstrate, the rate 
of convergence of p^ -^ po can depend on the detailed properties of po and / and may not always be immediately 
apparent. 

As a first step to understanding the factors which can affect the rate of convergence in the limit e ^ 0, this paper 
systematically examines five concrete examples numerically. It is found that: 

1. The rate of convergence of p^ to po as e ^ is often governed by power laws: ||/0e — poHi ^ e'' for some 
7 > 0, where Hp^ — Polli — Jm \p<^ ~ Pol denotes the L^ norm on M. Note that because 



\\Pe-pa\\i= sup / <j)-{p^-po), (3) 

{0:||0IIoo<1}"'M 

convergence in L^ norm implies a uniform rate of convergence of expectation values and is more stringent than 
stochastic stability. 

2. A simple heuristic allows one to predict the exponent 7 based on the structure of the density pa,ina sense to be 
made more precise below (see Equations (7) and (8)). 

3. The heuristic fails for systems with some "intermittency," i.e. systems which do not exhibit exponential decay 
of correlations [15, 17]. For these examples, the convergence of p^ to po continues to be governed by power 
laws but the heuristic provides only an upper bound on the power law exponent 7. 

The notation "/ ~ g" means that there exist positive constants ci andc2 such that ci/(x) < g{x) < C2,f{x). Similarly, 
the notation "/ < g" will be used below to mean that there exists c > such that f{x) < cg{x). 

Explaining the heuristic estimate requires the formalism of Perron-Frobenius transfer operators [1]. Let g be a 
function on M and define the operator Tf by 

The operator Tf reformulates the dynamics in terms of probability densities: if q is the probability density of Xk , then 
TfQ is the probability density of Xk+i = f{xk). This elementary fact follows from the change of variables formula. 
Clearly, po is an invariant density of / if and only if it is an eigenfunction of Tf with eigenvalue 1: Tfpo — po- 



Furthermore, T?q -^ po as n ^ oo if the eigenvalue 1 of Tf is simple and the initial density q is sufficiently regular. 
And, if Tf has a spectral gap, that is if the eigenvalue 1 is an isolated point of the spectrum cr{Tf) of Tf and the radius 
\<j{Tf) \ {1}| of the smallest disc in C containing <j{Tf) \ {1} is strictly less than 1, then the convergence of T'^q to 
po as n ^ oo is exponentially fast. The spectral gap condition also implies the exponential decay of correlations, i.e. 
there exist positive constants c and 9 <1 such that 



(j) ■ (Tftp) ■ po- 0PO • / ippa 
M Jm Jm 



< c6i" (5) 



for all sufficiently smooth observables (p and ip. Systems with exponential decay of correlations are also said to be 
exponentially mixing. Note that 1 > 6 > |cr(T/) \ {1}|. See [1, 2] for details. 

The "noisy" transfer operator is defined in a similar way: let G^ be the averaging operator 

(G,g) (x) - e-'' J^ g (^^^ q{y) dy, (6) 

where g is the common probability density of the IID random variables ^j. and d — dim(M). The operator G^ 
represents the effect of additive noise on a probability density q. The noisy Perron-Frobenius operator T^ is then 

T, = G,Tf. 

As before, the operator Te describes the random dynamics (2) in terms of probability densities. A density p^ is invariant 
under (2) if and only if it is an eigenfunction of T^ of eigenvalue 1 . 

We can now discuss the heuristic. Under fairly general conditions, T^q converges to p^ exponentially fast as 
n ^ oo for fixed e [2]. Suppose pe ^ po as e — > 0. Since lim„^oo T^Po = pe exponentially fast, we may expect 
||T"po ~ Pelli to be small for n finite but sufficiently large. If such an n can be chosen independent of e, it is then 
natural to guess that for all sufficiently small e, 

||P£ -Polli ^ llTTPo-polli- (7) 

Equation (7) states that one can estimate the difference between p^ and po by studying the effect of applying Te a finite 
number of times to the noiseless invariant density po- Note that the rate of convergence of T"po to pe as n ^ oo is 
governed by the size of the spectral gap |cr(T'e) \ {1}| . If |cr(Te) \ {1}| can become arbitrarily small as e — > 0, n may 
have to be very large (or even increase as e decreases) in order for (7) to hold. On the other hand, in exponentially 
mixing systems with a sufficiently large spectral gap \o-{Te) \ {1}| (e > 0), it is likely that one can take n to be a small 
integer independent of e. Equation (7) should be taken only as a rough guideline for what can be expected in studying 
the convergence of p^ to po in the L^ norm in the small noise limit e ^ 0. 

In the exponentially mixing examples of Section 2, it is found empirically that n = \ suffices: 

\\pe - polli ^ \\TePn - polli = \\Gepo - polli- (8) 

This is not unexpected if supQ<g<g|j W{T<i) \ {1}| ^ 1 for some eo > 0. Equation (8) states that one can estimate 
the difference between p^ and po by studying the effect of applying the averaging operator G^ once to the noiseless 
invariant density po- Note that "half" of (8) is always true: because (/ — T(){pe^Po) — GePo — po, the inequality 

IIGePo -Polli < 2||p£ -polli (9) 



Map 


Exponent 7 = lim.^o '°"^' ^^^.^"^^ 


Predicted 


Computed 


Uniformly expanding 


2 


1.996 ±0.0071 


Piecewise expanding 


1 


0.97 ±0.033 


Quadratic (Misiurewicz) 


1 

2 


0.52 ±0.056 
0.46 ±0.050 


Neutral fixed point 


<0.5 

< 0.7 

< 0.3 


0.31 ±0.028 
0.53 ±0.056 
0.17 ±0.033 


Stadium 


<2 


1.40 ±0.022 



Table 1: Summary of results. The first example, a smooth uniformly expanding map, has a smooth invariant density. 
Thus GcPo — Po = \^ p'l) + o (e^) and the heuristic predicts O(e^) convergence as e — > 0. The second example has 
a piecewise continuous invariant density. The heuristic thus predicts 0(e) convergence. The third example has an 
invariant density which contain singularities of the form a;^^/^. The heuristic thus predicts e^/^ convergence. The last 
two examples are not exponentially mixing, and the heuristic only predicts upper bounds. 



always holds. So as e ^ 0, \\p^ — polli cannot converge to faster than ||Gepo ~ Polli and the heuristic always 
provides an upper bound on the power law exponent 7. 

In problems where (8) applies, the effect of G^ on po (and hence the scaling of | |GePo — Pol |i as e -^ 0) depends 
on the structure of p^. if po is very smooth, as in the uniformly expanding map in §2.1, then the effect of G^ will be 
to simply "flatten" po, '-e. decrease the size of /?[, (see Figure 2). The convergence of G^pq to po as e ^ will then 
be fast {e.g. 0(e^)), so the heuristic also predicts a fast convergence of p^ to po as e ^ 0. On the other hand, if po 
contains any discontinuities or singularities (as in the examples of §2.2 - §3.1), then the main effect of G^ will be to 
smooth out the discontinuities or singularities. The rate of convergence predicted by the heuristic thus depends on the 
precise form of the singularity in question. As the heuristic is motivated by the exponential convergence of r"po to 
Pe (in the limit n -^ 00) it is natural expect it to make sense only when \(y{T^ \ {1}| < 1 uniformly in e > 0. This 
suggests, among other things, that the heuristic will work only when Tf itself has a spectral gap, or that correlations 
decay exponentially fast in the noiseless system (1). As will be seen, the available numerical evidence supports this 
claim. 

This paper examines five concrete examples, three of which are exponentially mixing (see §2) and two which are 
not (§3). It is found that the heuristic is valid for each of the exponentially mixing examples but does not work for the 
intermittent examples. Each example is accompanied by an explanation of the numerical techniques used to compute 
the invariant densities. Table 1 summarizes the numerical results described in the rest of this paper 

The remainder of this section is devoted to a brief review of background and motivation for this work. 



1.1 Motivation & background 

Chaotic systems are intrinsically unpredictable. The tools of ergodic theory and statistical mechanics are therefore 
necessary for questions concerning the long-time behavior of dynamical systems [7]. Invariant probabihty measures 



capture the long-time statistical properties of dynamical systems. 

Deterministic dynamical systems generally possess a multitude of invariant probability measures. For example, 
fixed points and periodic orbits support invariant measures. However, when the system in question is chaotic, most 
of these invariant measures are unstable and do not have directly observable effects on the long-time dynamics. The 
notion of stochastic stability (due to Kolmogorov) arises naturally as a criterion for separating these unstable invariant 
measures from those which are physically relevant: because no real physical system can be completely isolated from 
the rest of the universe, every physical experiment is susceptible to the effects of noise. Thus only stochastically stable 
invariant measures have directly observable effects on the long-term statistics of dynamical systems. 

One of the earliest general results on the stochastic stability of invariant measures concerns systems with uniformly 
hyperbolic attractors. Such systems possess Sinai-Ruelle-Bowen (SRB) measures [21], which (among their many 
properties) capture the statistical properties of a set of initial conditions of positive Lebesgue measure. Kifer proved 
that SRB measures of uniformly hyperbolic systems are stochastically stable [11]. However, his method of proof 
cannot address the question of convergence rate in the limit of small noise. In [2], Baladi and Young examine the 
question of stochastic stability and the rate of convergence in the setting of expanding maps and convolution-type 
perturbations. Blank and Keller, in subsequent work, proved results for more general perturbations of 1-dimensional 
maps [3]. 

In addition to understanding more deeply the stochastic stability of invariant measures, the results described here 
may be relevant for numerical studies of dynamical systems with intermittent, metastable behavior. The injection of 
noise into a numerical simulation can help reduce initialization bias in numerical computations of expectation values 
via time averaging while adding controllable errors to the computed expectation value [14]. 

2 Exponentially mixing systems 

The examples in this section have one common feature: their transfer operators T/ all have spectral gaps. That is, 
|(t(T/) \ {1}| < 1. This implies the exponential decay of correlations. The heuristic estimate (7) or its variant (8) 
are expected to work for maps whose transfer operators have spectral gaps. This is found to be the case. For all the 
examples in this section, the common density 5 : R ^ M of the random variables ^k is taken to be 



(10) 



(The first two examples are maps on S^, here identified with the interval [0, 1]. The random perturbations should 
therefore be taken modulo 1.) The perturbations e^fe are thus uniform random variables on the interval [— e, +e]. 
Calculations using gaussian kernels wrapped around the circle (not shown here) indicate that the scaling is insensitive 
to the exact form of the kernel g. 

2.1 Smooth expanding circle map 

Consider f : S^ O defined by 

/(x) = 2a; + asiii(27ra;) (modi). (11) 

See Figure 1 . This map is clearly smooth and is uniformly expanding for < a < 57 • Its invariant density po is easy 
to compute by discretizing the Perron-Frobenius operator Tf on a uniform mesh and is shown in Figure 2. 
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Figure 1 ; A smooth uniformly expanding map. 
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Figure 2: The invariant densities p^ with e e i Oj ik' Tn^' I' ?79 ' I f ' ^^^ '■^^ ™^P ^^ ^)- '^^^ black curve is pq\ the 
densities p^ become flatter as e increases. The map parameter is a = 0.15. 



Before discussing the details of numerical calculations, it should be noted that one can prove (8) for the map 
(1 1) when the parameter a is sufficiently small. Although the analysis is simple and uses only standard techniques, 
it is nevertheless instructive and is included here for completeness. Let B denote the space C^ with the usual norm 
\\h]\B — max(||/i||oo, ||/i'||oo) , and let Bq be the subspace {h E B : Jj^j /i = O} . Note that Bo is T^-invariant for 
e > 0. It is not difficult to show that the Perron-Frobenius operator Tf associated with (11) has the property that 
1^/11 Bo < 1 when the parameter a is sufficiently small: differentiating (4) shows that ||(T/g)'||oo < CaHqHs for 
some constant Ca dependent on a, and Ca < 1 when a is small. Because J^^ Tjq = J^^ q ~ 0, the Poincare inequality 
shows that llT/glloo < ||(r/q)'||oo. So ||T/||bo = sup{||r/(z||B : q e Bn,\\q\\B ^ 1} < c < 1. 

Now let g be a probability density on S^ and let G^ be the associated averaging operator. Denote the (unique) 
invariant density of T^ = G^Tf by p^. Since ||Ge||B < 1, \\Te\\Bo < WTjWbo < 1 for all e > 0. Thus the restriction 
{I — Tf) \Ba of I — Tf to i?o has bounded inverse for all e > 0. A simple calculation shows that 

Pe - Po = [{I - Te) IboV^ {GfPa - po) ■ (12) 

Thus 



Up. - Polls <||(/-T,)-l||Bol|GeP0- Polls 

<(i-||7^/IIbo)"'||g.po- Polls. 



Also 



||G,Po - Polls = ||(/ - Tf) ■ (/ - Tf)-' ■ {GfPo - po)||s 

<||/-r,||Bo-||p.-polls. 

Since \\I -Tf\\Bo < 1 + ||T/||bo, this means 

IPc - Polls '^ ||Gepo - Polls (13) 

with e-independent constants in the ^ relation. Note that this already means pf — po converges to at the same rate as 
GfPo — po in the G' metric as e ^ 0. Combining Equation (9) and the fact that | |/9c — po| Is > I |Pe ~ Pol |i yields 

\\GfPa ~ polli < \\Pe - Polli < \\GfPo - Polls- (14) 

For the map (11), it can be shown that the unique invariant density po is smooth (see [1] for details). So Gfpo — po — 
ie^Pg + o (e^) and HGepo ^ Polli ^ \\GfPo — Polls when e is sufficiently small, so that 

IIGePo -polli ~ ||P£ -Polli ^ e^- (15) 

Thus the heurstic estimate holds for (1 1) when a is small. A little more is true: for any smooth expanding map /, 1 
is an eigenvalue of the Perron-Frobenius operator Tf with algebraic multiplicity 1 (see [1]). This means Tf \bo has 
spectral radius < 1. As \a (Tf) \ ~ lim„^oo | |r?| |^/", the facts above together with the previous argument show that 
for any smooth expanding map / there exists an integer A^ > 1 such that the heuristic holds for /^ = f ° f ° ■■■ ° f, 
i.e. if pi Ms the invariant density for Xfe+i = /^(x/j) + e^^ then ||pe ' — po||i ~ HGePo ~ Polli- 

To test the general validity of the heuristic (8) when a is not necessarily small, it is natural to choose a close to 
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Figure 3: The differences \\p^ — PoHi and ||GcPo ~ Polli as a function of e on a log-log graph. The slope is 1.996 ± 
0.0071. The slope is calculated using least-squares regression on the interval e e [10~'^'^, 10~^]. Note that this means 
data points on the far right are discarded. Throughout the paper, data points for which error bars are available are 
drawn as vertical lines with 3 horizontal marks: the vertical line marks the abscissa of the data point, the middle mark 
the ordinate, and the top and bottom marks are the upper and lower bounds on the error The error bars in this figure 
are too small to be seen. 



1/2tt = 0.15915494309189535.... In what follows, a is set to 0.15. In Figure 2, it can be seen that p^ becomes flatter 
as e increases. This is not suprising: as e — > oo, the noise dominates the dynamics and the invariant density is just the 
Lebesgue measure. As e ^ 0, the densities converge to po. 

The L^ distances | |pe — po 1 1 1 are plotted in Figure 3 as a function of e on a log-log scale. Clearly, \\p^ — Po 1 1 1 ~ ^"^ 
for some 7. The exponent 7 can be calculated as the slope of the line in Figure 3 using standard least-squares regres- 
sion: it is about 1.996 ± 0.0071 when 7 is computed via least-squares regression on the interval e G [10~'^'^, 10~^]. 



In comparison, GePo — Po = \^^Pa + o (^^) 



that 



impHes HGePo 
llPe-Polli -- 



■Polli 



e\ 



Thus, the heuristic estimate (8) predicts 



For this particular map, the invariant density po is smooth. Therefore the main effect of Ge on po is to "flatten" it out, 
consistent with the pictures of p^ in Figure 2. 

The numerical data thus provide evidence that the heuristic is valid for values of the parameter a which is not close 
to 0. The heuristic argument predicts a little bit more than just the exponent 7: it predicts that when the spectral gap 
of Tf is sufficiently large, the quantities \\p^ — Po 1 1 1 and | \G^p — po 1 1 1 will be comparable. This is also borne out by 
the data, as can be seen in Figure 4. The numerically computed spectral gap is about 0.848, which is quite large. This 
explains why | jp^ — po 1 1 1 and | jGePo — po 1 1 1 are so close in Figure 4. 



0.08 


1 


' /■ 


0.06 


- 


/ ■ 


1-norm q _ 04 


. 


/ 


0.02 


-y^ 


1 



0.1 



0.2 
eps 



0.3 



0.4 



Figure 4: The differences ||pe — Polli and HGcPo — Polli as a function of e on a linear graph. 



Numerical method. 

It is straightforward to compute the invariant densities shown in Figure 2 using a finite difference discretization of the 
Perron-Frobenius operator Tf. Let (^j denote the inverse branches of /: / o (^^ = id, j = 1, 2. (There are two branches 
because the map / is 2-to-l.) We can then write T/ as 

The numerical procedure is then: 

1. Identify S^ with the interval [0, 1] with periodic boundary conditions and let Xi = i/N, i = 0,1, 2, ..., N — 1, 
be a uniform grid of size N. 

2. Let p be an iV-vector with nonnegative entries. Define the matrix T by 



im. = Y.Tii 



p{4>j{xi)) 



{\n<t>Ai^))y 



(16) 



where we extend the A^-vector p to a function on [0, 1] via polynomial interpolation using the grid points closest 
to X. Note that interpolation is necessary becuase the vector p only contains values of the density on the grid 
and <j)j (xi) will generally not be a grid point. 

3. Using a numerical linear algebra package like ARPACK [13], which implements an iterative Arnoldi eigen- 
problem solver, compute the eigenvector po of T with the largest eigenvalue. This requires only a function for 
performing matrix-vector multiplies, which avoids storing and diagonalizing the entire matrix T. 



In the computations shown in this section, the polynomial interpolation of p utilizes a stencil with 6 points and the grid 
consists of TV = lO** points. For smooth maps like (1 1), the order of convergence (as N —> oo) is formally 6. This is 
confirmed empirically by numerical tests. Thus the finite difference scheme converges rather rapidly and provides an 
efficient way to compute po • 

Note that there is a closely related method, due to Ulam [10], which consists of forming a partition of the phase 
space and computing transition probabilities to form a stochastic matrix. The stochastic matrix corresponds to a finite- 
state Markov chain which mimics the dynamics of the map / and whose invariant probability distribution can be 
computed using numerical eigenproblem solvers. Ulam's method is more general: it applies to dynamical systems 
whose invariant measures are singular. As a method for computing invariant densities, however, it is empirically found 
not to be as accurate or efficient as the high-order finite difference schemes outlined above. This is perhaps not so 
surprising: Ulam's method is, intuitively, only a low-order finite element scheme for the equation T/po = Po- 

The computation of p^ relies on a similar finite difference scheme. Instead of discretizing T^ = G^Tf directly, it is 
easier to reuse the matrix T from the computation of po and multiply it from the left by a discretization G of Ge . The 
matrix G is constructed by 

1 I'Xi+e 



{Gp)^ = TT / P{x) dx. (17) 

The integral is evaluated numerically using the trapezoid rule. This simple scheme suffices here because it is second- 
order accurate, and with N = 2 x 10^ the expected error is on the order of 10^®. This is smaller than the L^ norms 
||Pe — Polli to be computed. ARPACK can then be applied to the matrix G ■ T. The reason that Tf is discretized to 
6th order while Ge is only discretized to 2nd order is empirical: numerical experiments indicate that discretizing Tf to 
higher order accelerates the convergence of ARPACK eigenproblem routines. This phenomenon is not completely un- 
derstood, but intuitively a higher-order discretization T of Tf reproduces the spectral properties of Tf more accurately. 
As Tf \b„ is strongly contractive, T will also be strongly contractive. 

In actual computations performed on Sun Ultra 5 workstations and on PowerPC G3/G4-based Macintosh comput- 
ers, the computation of each invariant density p^ can take anywhere from 10 seconds to 5 minutes. The variations in 
running time are sometimes quite unpredictable; they are certainly not monotic in e as one might expect. Note that 
the running time of ARPACK routines depends quite sensitively on the spectral decomposition of T, so the variations 
in running times provide some indirect evidence in the complexity of the e-dependence of the spectrum of T^ . In any 
case, the finite difference scheme is in general quite efficient and allows the systematic exploration of different values 
of e (shown above) and a (not shown here). 

The error bars in Figure 3 are computed by repeating the calculations with N — 10^ points and checking the 
numerical convergence of the resulting estimates for | Ip^ — Po 1 1 1 • This provides error estimates AA{e) for each com- 
puted value of A{e) = \ogiQ{\\pf^ — polli)- These error estimates can then be combined to provide an upper bound on 
the error A7 in the estimated exponent 7 by linearizing 7 about the computed values A{ei): 



n 

A7<-E 



n . 

2 — 1 



i^il#^ . ^^(^^ 



(18) 



where p — - J27=i logio (^0 '^^'^ a^ = — X]"=i (l^gio i^i) ^ P)"^ ■ Note that because of the high order of convergence 
of the finite difference scheme, the main source of error in estimating 1 1 r/iOg — po 1 1 1 actually comes from the evaluation 
of I |/5e — Pol |i using the trapezoid rule. Since p^ — po is continuous, the trapezoid rule is second-order, and this error 
dominates all others in the computation. It is therefore natural to combine the results of the two runs to obtain a more 
accurate answer via Richardson extrapolation. This has been done in computing the exponent 7. The error estimates 



10 



are therefore rather conservative. 

2.2 Piecewise-expanding map 

The second example is 

f{x) = { . (19) 

2(1 -.t), i<a;<l 

Strictly speaking, / is not expanding because f'(x) = 1 for < x < i . However, /^ — / o / is expanding and / falls 
within the class of piecewise expanding maps considered by Lasota and Yorke [12]. 
The invariant density of (19) is easy to compute analytically: it is 




The invariant densities p^ are plotted in Figure 5. Again, it can be seen that pe becomes flatter as e increases: the main 
effect of Ge is to smear out the jump discontinuities of po- Note that the "overshoot" visible in Figure 5 at .t = and 
a; = i, reminiscent of the Gibbs phenomenon, is not a numerical artifact. Rather it is the product of the action of the 
map and the random perturbation. More precisely, it is produced by the following mechanism: 

1. Consider T^pa with n = 1,2, 3,.... For n = 1, T^po = GePo looks very much like pa, but the action of Ge 
replaces the discontinuity with a linear transition region of 0(e) width. 

2. On the next iterate, the action of Tf then cuts G^po into two pieces and rearranges them in such a way as to 
produce the "overshoot" of 0(1) height and 0(e) width. The next application of Ge smooths out the transition 
some more but does not change the asymptotic e-dependence of the overshoot and the transition region. 

As TV^po converges to p^ with increasing n, this process is iterated to produce the overshoot structure. Seen in this 
light, the overshoot must have 0(1) height and 0(e) width, making a 0(e) contribution to \\pe — po 1 1 1 • 

Clearly, the main effect of G^ is to smear out the discontinuity in pq. So HGePo — Polli ^ e and the heuristic 
predicts 

\\Pe -Polll --£• 

This is consistent with the densities p^ shown in Figure 5. These L^ norms are plotted in Figure 6 as a function of e on 
a log-log scale: the slope 7 is 0.97±0.033, consistent with the prediction that 7 = 1. The linear scaling of ||/9e — poHi 
with e should also be apparent on a linear scale. This is indeed the case: see Figure 7. 

The proof from Section 2.1 can be adapted to show that there exists an A^ > such that if pe is the unique 
invariant density of Xk+i = f^ {xk ) + e^^ , then T^n is a bounded operator on the space B of functions of bounded 
total variation with the norm II-Hbv = IMIi + (total variation) and ||ryjv||BQ < 1. One obtains ||/0e — Polls ~ 
\\GePo — Polls- This result is, unfortunately, not so useful for this map: because po is discontinuous, G^Po will not 
converge to po in IMIs as e ^ 0. 

Numerical method. 

For this example, it is straightforward to implement the finite difference scheme of Section 2.1, with the following 
modifications: 

11 
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Figure 5: Invariant densities pc for (19) and its random perturbations, with e = (black curve), 0.0088, 0.0176, 0.0353, 
0.0707, 0.141, and 0.282. As e increases, the width of the transition region increases too. 
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Figure 6: The differences \\p^ — polli and ||Ge/9o ~ Po||i as a function of e on a log-log graph. The slope is 0.97 ± 
0.033. The slope is calculated using least-squares regression on the interval e € [10"'^, 10"^]. Note that some of the 
data points have error bars which are too small to be seen on this scale. 
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Figure 7: The differences \\p^ — Polli and HGcPo — Polli as a function of e on a linear graph. 

1 . One should avoid interpolating across discontinuities in / and in /' . Doing so is empirically seen to produce 
unacceptably large errors in the computation of Tp. For (19), this means the polynomial interpolation of p 
should not use stencils which contain or ^ as an interior point. 

2. In order to avoid interpolating across or ^, it is necessary to do one-sided interpolations, i.e. interpolate near 
the edge of the stencil. It is well known that one-sided polynomial interpolation on uniform grids can produce 
large errors. Rather than using Legendre interpolation, however, it suffices to simply increase the number of 
grid points and check that the resulting answer has converged numerically. 

In the computations shown in this section, the polynomial interpolation of p utilizes a stencil with 4 points and the 
grid again consists of A^ = 10^ points. Numerical tests show that the order of accuracy lies between 3 and 4. Thus the 
finite difference scheme is still sufficiently accurate to provide an efficient way to compute p^ . 

The error bars in Figure 6 are computed by repeating the calculations with A^ = 5 x 10"^ points and checking the 
numerical convergence of the resulting estimates for ||pe — po||i- As in §2.1, the main source of error in estimating 
\\pe — Po||i comes from the trapezoid rule. The grid explicitly contains the points of discontinuity, so the trapezoid 
rule is still second-order accurate. 

2.3 Quadratic maps 

The third example is the quadratic map / : [— 1, 1] O: 

/(x) = 0.9-ax2,a>0. (20) 

Note that / maps [—1, 1] into [0.9 — a, 0.9], so for a < 1.8 and e < 0.1, the Markov chain defined by (2) will always 
stay inside the interval [—1,1]. 
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Figure 8: The invariant densities p^ with increasing values of e. The map parameter is a = 1.7152100141023...; the 
critical point is sent to an unstable fixed point at 0.489320111422868... in 3 steps. The noiseless density therefore has 
three x^^/^ singularities. 



Unlike the previous examples, this map has a critical point: /' (0) = 0. The map is contractive in a neighborhood of 
this critical point. Whether / has a invariant density (as opposed to singular invariant measures) depends on the fate of 
the critical point. For example, if there exists an integer ?i > and a stable fixed point xq such that /"(O) = xq, then 
a positive amount of probability will collapse onto xq to create a 8 mass there. Another scenario which can prevent 
the existence of an invariant density is if /"(O) comes close to infinitely often as n — > oo. 

To ensure the existence of a density, it is enough to choose the parameter a so that the map satisfies the Misiurewicz 
condition: the critical point falls into an unstable periodic orbit after a finite number of iterates. Misiurewicz proved 
that when this condition is satisfied, the map / possesses an invariant density po [16]. The rest of this section considers 
only Misiurewicz maps in the family (20). Note that even when the Misiurewicz condition is satisified, the action of / 
creates a x~^/^ singularity at each forward image /"(O) of the critical point 0. Thus the invariant density po of / must 
contain a;^^/^ singularities, as one can see in Figure 8. 

The Perron-Frobenius operator Tf is known to have a spectral gap even though / is not uniformly expanding [20]. 
This map therefore provides a more stringent test of the heuristic estimate than the previous two examples. Because 
the invariant density po contains x~^/^ singularities, the main effect of G^ on po is to mollify these singularities. This 
is again consistent with the picture in Figure 8. The heuristic predicts then that 



\Pe -Polll 



\GePQ -Polll 



.1/2 



(21) 



The L}- norms \\p^ — po||i were calculated for a = 1.7152100141023... and plotted as a function of e on a log-log 
graph in Figure 9; the error bars mark 1 standard deviation from the computed value. Using least-squares regression 
on the interval e G [10^^, 10^^'^] to calculate the slope 7 yields 0.52 ± 0.056; the corresponding line is shown as a 
solid line. The data is consistent with the prediction that 7 — 1/2. More careful calculations are necessary to decide 
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Figure 9: The differences | jpc — po 1 1 1 and | jGePo — Po 1 1 1 as a function of e on a log-log graph. The slope of the best 
linear fit is 0.52 ± 0.056; the corresponding line is shown as a solid line. The slope is calculated using least-squares 
regression on the interval e e [10^'', 10^^'^]. Note that this means some of the rightmost data points are discarded. 
The error bars mark the mean square error in the computed value. 



whether 7 is exactly 1/2, as predicted by the heuristic. 

The computation is repeated for another Misiurewicz parameter, a — 1.777776174649396. Again, using least- 
squares regression on the interval e G [10^^, 10^'^] to calculate the slope 7 yields 0.46 ± 0.050. The results are shown 
in Figure 10: the data is again consistent with the claim that 7 ~ 1/2. 

Numerical method. 

Unlike the previous examples, attempts to compute invariant densities by discretizing T/ (or Tf) for (20) does not work 
consistently: ARPACK routines will converge only for some values of e and not at all for others. Sometimes ARPACK 
produces eigenvectors which have no obvious connection to the dynamics or to the known form of pq. This sensitive 
dependence on the value of e may be related to the extreme sensitivity of the map to the value of the parameter a: 
Misiurewicz parameters do not form an open set; nearby values may not even have invariant probability densities. It is 
also possible that the structure of the spectral decomposition of Tj is sufficiently complex that ARPACK routines could 
notproduce well-resolved answers.' Fortunately, because of the a;^^/^ singularities, the values of Ijpe — polli needed 
here are orders of magnitude larger than for the first two examples. (Compare Figures 3, 6, 9, and 10.) Furthermore, 
Tf has a spectral gap, so correlations decay exponentially fast. This suggests that the usual time-averaging procedure, 
based on the ergodicity of /, can provide sufficiently accurate estimates of the probability densities pe .: 

1. Partition the interval [—1,1] into N equal-sized intervals /j. 



' In contrast, Ulam's Markov chain method should work quite well for this map. This option was not explored because it was not necessary. 
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Figure 10: The differences | |pe — po| |i and | \GePo — po| |i as a function of e on a log-log graph. The slope is 0.46 ± 
0.050. The slope is calculated using least-squares regression on the interval e G [10~^, 10"'^]. The error bars mark the 
mean square error in the computed value. 



2. Pick a random (uniform) initial condition xq and compute xi , 0:2 , 
which Xk visits each of the intervals in the partition. 



, xm using (2). Record the frequencies with 



3. Let Pi be the relative frequency of the zth interval in the partition. By the ergodic theorem, pi should be approx- 
imately the probability /^ Peix) dx. The L^ distance between p^ and po can be computed by 



N 



\Pe -Polll 



Y.M^)-HQ)\- 



(22) 



i=i 



^N 



Let US denote the estimator ^j^-^ |pi(e) —pi(0)| by <i>(e); it provides estimates of ||/9c — polli- 

Applying the estimator $(e) to a set of noise amplitudes e^ and combining the results with least squares regression 
yields an estimator 7 of 7. One might expect to obtain better results by adapting the partition to better resolve the 
x^^/^ singularities (their positions can be determined a priori: they are located on the forward images of the critical 
point). However, it is easier to use uniform-sized partitions, and they appear to be sufficient for this calculation. 

The error analysis is standard. The estimator $(e) is biased because the finite partition used in Equation (22) 
induces a can only locate the zeros of p^ — po up to a length scale of A^^ ^ . More precisely, each sign change of p^ — pQ 
contributes an error of order N~'^ to the right hand side of Equation (22). As there are only a finite number of zero 
crossings, the overall bias is 0{N^'^). The partition size N in this calculation should be sufficiently large (see below) 
to make the bias negligible. The mean square error of 7 is thus assumed to be dominated by its variance. 
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The variance of 7 can be estimated from the variance of the $(e): the standard deviation of the estimated proba- 

£1 

M' 



bility Pi is Api ~ \/ ^, where M is the number of steps taken during the course of the computation.^ Summing over 



I gives 

standard deviation of $(e) < \ — . 

As 7 is linear in logj^gl'^'l^)); its standard deviation is bounded by 



1 

n 



V- / logio(eO -/A^ ^ 



\^V ^' J M E [$(£,)] 



(23) 



where fi ^ ^ J27=i loglo(e^), <^^ = ^ J27=i (logio(eO - /")^ and § ■ gp^ is an estimate of the variance of 
log]^o(^(^))- Since the expectation values E [<&(ei)] appear in the error bounds but are not available exactly, they are 
replaced by the estimates $(ei) themselves in the calculations. Note that this formula assumes that the estimates $(ei) 
and $(62) are statistically independent if ei 9^ £2. 

The results shown in this section have been computed using a partition of A^ = 2^® — 65536 intervals and 
Af = 1.6 X 10^ X Af w 10^" steps. Thus standard deviation of $(e) is on the order of 2.5 x 10^^. 

3 Intermittent maps 

The examples in the previous all exhibit exponential decay of correlations. In this section I examine two examples 
whose Perron-Frobenius operators T/ do not have spectral gaps and correlation functions decay algebraically. The 
subexponential decay is caused by parts of phase space where the map / is nonexpanding. This therefore provides a 
model of "intermittency." See [17, 15]. 

3.1 Circle map with neutral fixed point 

The first of the intermittent examples is the map 

, ._ _ j;i+", < a; < i 

/(^) = <! ' • (24) 

i < a: < 1 

This is a modification of the angle-doubling map x h^ 2x (mod 1) with a one-sided tangency to the diagonal at x = 0: 
limj;^o+ .f'{x) = 1- Dynamically, this means that f{x) « x when x is small and positive, so whenever Xk lands 
near and to the right of the origin, many subsequent iterates are required before the trajectory "escapes" from 0. The 
tangency has a significant impact on the dynamics: it can be proved that correlations decay like n^^^/" for this map, 
and that the invariant density has a x^" singularity at a; = [22]. See Figure 11. 
For this example, the perturbation kernel is taken to be 

1, -1 <x < 
g{x) = { . (25) 

0, otherwise 




This assumes that the number of points in a given interval follows a Poisson distribution. 
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Figure 1 1 : The invariant densities p^ with increasing values of e. The map parameter is a = 0.5. The noiseless density 
po therefore has a x~^-^ singularity. Note that this figure uses a log-log scale. 



This choice is arbitrary; using (10) does not affect the results. 

As in Section 2.3, the main effect of Gc is to smooth out the singularity in pQ. Thus the heuristic (8) would predict 
that 



\P<L -POIII 



\GtPo -Polli 



This prediction does not change when one uses the more general form of the heuristic (7) with n > 1. However, it 
can be seen in Figure 12 that ||pe — polli ^ e^ "^ in this computation a = 0.5, so we would expect e°-^ convergence 
as e ^ 0. And yet the computed exponent 7 is only 0.31 ± 0.028, significantly smaller than 0.5. This result can be 
verified by repeating the calculation for different values of a : for a = 0.3, the heuristic predicts e^-"^ convergence. 
The real exponent is 0.53 ± 0.056. And for a = 0.7, the heuristic predicts e°^ convergence. The real exponent is 
0.17 ± 0.033. The heuristic is qualitatively correct, though: as a increases, 7 decreases. Also, the computed exponents 
are consistent with the fact (see Section 2.1) that the heuristic always provides a lower bound for the error | jpe — po 1 1 1 ■ 
The numerical results indicate that the lack of a spectral gap can have a significant, qualitative impact on the rate 
of convergence of p^ to poi and hence on the degree of stability of pQ under random perturbations. 

Numerical method. 

The computations for this example again rely on the finite difference scheme of Sections 2.1 and 2.2. The only 
differences are: 

1. As in Section 2.2, it is important to avoid interpolating across discontinuities of /'. Again, this means using 
stencils which do not contain a; = and x = 1/2 as interior points. In fact, because of the singularity in po at 
0, the origin should be excluded from the grid. 

2. In order to resolve the x^" singularity in ^0(2;), it is necessary to make the grid finer near 0. The grid I use is a 
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Figure 12: The differences | |pe — po| |i and | \GePo — po| |i as a function of e on a log-log graph. The slope is 0.31 ± 
0.028 and is calculated using least-squares regression on the interval e e [10^'', 10"^'^]. 
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Figure 13: The differences | |pe — po| |i and | jGePo — Pol |i as a function of e on a log-log graph. The slope is 0.53 ± 
0.056 and is calculated using least-squares regression on the interval e e [10^*, 10^^]. 
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Figure 14: The differences | |pe — po| |i and | \GePo — po| |i as a function of e on a log-log graph. The slope is 0.17 ± 
0.033 and is calculated using least-squares regression on the interval e e [10^'', 10~^'^]. 



hybrid between a uniform grid and one which scales like a power law: near x = the grid switches to one with 
points Xj oc j^°',j — 1, 2, 3, .... This scaling is the same as in Young's tower construction [22]. 

In the computations shown in this section, the polynomial interpolation of p utilizes a stencil with 6 points and the grid 
consists of A^ = lO"' points. Numerical tests show that the order of convergence is between 5 and 6. Thus the finite 
difference scheme is sufficiently accurate to provide an efficient way to compute p^ . 

3.2 The Bunimovich stadium 

The last example is the stadium billiard: begin with a domain J7 C M^ which is the union of a rectangle and two 
semi-circular ends, and consider the dynamics of a free point particle inside 51 which collides with dfl elastically (see 
Figure 15). The state of the particle can therefore be represented by a point in fl together with a unit vector The 
discussion in this section focuses on the stadium map, which is the Poincare map / defined by the boundary dfl. That 
is, the map / takes as input a pair {x,9), where x e dil and 9 is an angle specifying the velocity of the particle, and 
outputs the position and velocity of the particle after the next collision has occurred.-' Mathematically, the domain of 
/ is homeomorphic to S^ x [0, tt], with the position variable being periodic. Note that unlike all other examples in this 
paper, the billiard map is 2-dimensional. The map / also has singularities: Df is discontinuous on the preimage of the 
vertical lines in Figure 17. The expository article by Chernov and Young [5] provides a clear survey of the statistical 
properties of billiards. While they do not discuss the stadium, they do explains many relevant ideas in terms of other 
billiard models. 



^This map falls outside of the framework set up in the Introduction, but most of the general discussion there applies to this example with only 
minor modifications. 
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Figure 15: The stadium. Two trajectories are shown here: a perturbed "bouncing ball" orbit between the top and 
bottom edges, and a near "whispering gallery" orbit which wound around the right circular wing. 



In order to carry out numerical calculations, a specific coordinate system is needed. The calculations described 
here adhere to the following coordinates: fix a reference point p* on dfl and specify points p on dfl by the length 
X of the arc subtended by p^, and p in the counterclockwise direction. The velocity vector is specified by the angle 
9 G [0, tt] between the vector and the tangent line to dil, again in a counterclockwise direction. See Figure 17. In 
these coordinates, the billiard map / preserves an invariant density po{x, 9) oc sin(0); the invariant density is uniform 
in the x (arclength) variable. The pair (/, po) is ergodic and has a positive Lyapunov exponent [4, 6]. 

For simplicity, rather than adding noise to both the x and 9 coordinates, only the 9 coordinate is perturbed in 
this example. The choice of 9 is natural: the geometry of the stadium suggests that small changes in 9 will be 
magnified very quickly; perturbations in x alone will not necessarily produce that effect. Furthermore, perturbations 
in 9 destroys all metastable periodic orbits, such as the so-called "bouncing ball" orbits. This ensures that the random 
dynamical system (2) has a unique invariant measure. One complexity which arises in adding noise to 9 alone is that 
the perturbation can no longer be purely additive: 9 must lie between and tt. In order to satisfy this constraint, the 



following kernel is used: 



ge{9M 



e+Soid ' 



= < 



2e' 



£+7r-6l„,j ' 



< e„., < e 
TT — e < 



(26) 



< TT 



This is a somewhat arbitrary recipe; it is not clear how much of the results in this section are due to the arbitrary nature 
of this specific recipe and how much is truly intrinsic to the stadium. 

Figure 16 shows the difference p^ — pQ. The particular stadium used in the computation consists of a square 
[—1,1]^ with two semi-circles of radius 1 attached. The arclength variable therefore ranges from up to the perimeter 
L = 27r + 4 of the stadium. In these coordinates, then, the vertical strips B = [tt/2, 2 + 7r/2] x [0, tt] and D = 
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Figure 16: Level sets of pc — po, with e = 0.1. 
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Figure 17: Illustration of the coordinate system. The angle 9 ranges from to tt while the arclength (position) x ranges 
from to L, the perimeter of the stadium 17. 
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Figure 18: Dynamically-generated support of the test function (f)^. The dark black set is {(p^ ~ —1} , the white set 
{0£ — +1}, and the remainder (gray) is {(f)^ = 0} . 



[37r/2 + 2, 37r/2 + 4] x [0, tt] correspond to the flat edges of the stadium, whereas the strips A = [— 7r/2, 7r/2] x [0, it] 
and C = [tt/2 + 2, 37r/2 + 2] x [0, tt] correspond to the circular ends. As one can see, p^ — po is more negative near the 
middle of the strips B and D and more positive near the edges of the strips A and C. This is not surprising: the effect 
of the perturbation is to decrease the amount of probability near the vertical bouncing ball orbits, which correspond 
to the parts of B and D near 9 = ^, and the asymmetry in the recipe (26) tends to create shallower trajectories with 
angles nearer or tt. 

Because po is smooth, the effect of applying Ge to po once is to smooth it out further Thus the heuristic predicts 
l/Oe ■" Po||i '^ e^- In this case, it is entirely possible that the heuristic (8) does not always apply. Instead, Equation 
(7) may be needed with n > 1. It is clear that, at least for n = 2, the main effect of the singularties in Df is to 
introduce "ridges" into T^po, i-e. lines in S^ x [0, L] along which T^po is not differentiable in one direction. Explicit 
calculations show that such ridges contribute a termof O(e^) to ||Tg^/5o ^ Po Hi- 
Numerical method and results. 

Because of the non-smoothness of the boundary dil, the stadium map / is also not smooth. Unlike the one-dimensional 
case (see Sections 2.2 and 3.1), these sets of discontinuity are no longer mere point sets but have geometric structure. 
This makes the direct discretization of Tj troublesome and the finite difference method of §2. 1, §2.2, and §3.1 difficult 
to apply. And, unlike the quadratic map in Section 2.3, the noiseless stadium dynamics has slow decay of correlations: 



< cn^^. 



(j) ■ (Tf-)p) ■ po- <Ppn- I i^po 

S^x[0,L] JS^x[0,L] Js^x[0,L] 



It implies that when e is small but positive, the noisy dynamics will exhibit exponential decay of correlations but with a 
very large decay time constant. This, combined with the two-dimensional nature of the map, renders the computation 
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of I |/9e — po 1 1 1 by the method of §2.3 impractical: the error of that method is proportional to the inverse square root of 
the number of samples per bin, which would need to be quite large in this case. 
Instead, notice that the basic property (3) of the L^ norm tells us that if we let 



(j),{x) = < 



+1, /?£ - po > 

-1, pe- Po <0 
0, Pe - po = 



then 

||P£-Po||l= / 0e-p£- / 0£ • PO • 

Jm Jm 

This suggests that approximate knowledge of the sets E+ = {p^ — po > 0} and _E_ = {pe — po < 0} will allow us 
to estimate | jpe — po 1 1 1 . But the geometric structure in Figure 16 gives us quite a bit of information about E± : Let E''_l 
be union of the small strip [7r/2, tt/2 + 2] x [0, e] and its symmetric images under the action of the discrete symmetry 
group of the stadium. The set of initial conditions i?^ generates orbits which bounce many times with shallow angles; 
these are the so-called "whispering gallery" orbits. Explicit calculations show that the brighter regions in Figure 16, 
roughly indicating E+ = {p^ — po > 0} , corresponds to the union of the forward images of E^ under the billiard 
map after a few (3 or 4) iterations. Similarly, let E^ denote the union of [—tt/2, -k 12 + 2] x [7r/2 — e, 7r/2 + e] and its 
symmetric cousins. This set of initial conditions generates "bouncing ball" orbits, and the union of the forward iterates 
of E'^_ provides a rough approximation of E^ = {p^ — po < 0} (the dark region in Figure 16). 

This allows us to construct an observable 0^ as follows: let <j>^ take on the value — 1 on the two sets near the midline 
of Figure 18, where p^ — po is negative, and let it take on the value +1 on the sets near the boundaries, where pe — po 
is positive, and set (pe ~ elsewhere. The quantity 



$(£) 



(f>e{Pe ~ Po) dx de 



(27) 



is then a lower bound of | |pe — po 1 1 1 . By construction, cp^ should maximize $(e) so that $(e) is as close to | |pe — po 1 1 1 
as possible. 

The quantity <l>(e) is readily computable by averaging the values of the observable (j>^ over a long simulated 
trajectory. The statistical error can be estimated in a standard way [19]: the standard deviation A<I'(e) of the estimated 
$(e) is bounded above by 



A$(e) < 



2 • var(0e) 



(28) 



where the autocorrelation time t„„ — J2^=o ^(") '^^'^ C!{n) is the autocovariance function of the observable (f)^. The 
results are shown in Figure 19, where $(e) is plotted against e on a log-log scale. As usual, the error bars mark 1 
standard deviation. The slope is approximately 1.40 ± 0.022 when fitted over the range —2.5 < log]^o(^) ^ ^1- ^^ 
$(e) only provides a lower bound on \\p^ — po||i, the main conclusion of this computation is that \\p^ — po||i cannot 
decay faster than e^-^. 

For Figure 19, $(e) is computed, when e > 0.00625, using a trajectory consisting of 5.1 x 10* steps. When 
€ < 0.00625, the computation uses 8.1 x 10". This is necessary because in order to determine the exponent 7, one 
must compute log]^Q($(e)) accurately. But the absolute error in log]^Q(<l>(e)) is proportional to the relative error in 
$(e), so it is necessary to use more steps when <i>(e) is small. 
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Figure 19: This figure plots <l>(e) = \Jf 4>e{Pt — Pa) dx d9\ as a function of e, on a log-log scale. This provides a 
rigorous lower bound on the norm \\p^ — polli- The observable (f)^ is constructed to make the two quantities as close 
as possible. The slope of the best linear fit is 1.40 ± 0.022 when least-squares regression is performed over the range 
e e [10^^'^, 10^^] . This excludes points on the far right. The error bars mark 1 standard deviation. Note that some of 
the data points have errors which are too small to be seen on this scale. 



4 Concluding Remarks 

The calculations described in this paper lead to many intriguing questions. In addition to the question of how one 
might formulate and prove a precise version of the heuristic estimate (7), there is also the question of how to correctly 
predict the scaling of ||pe — polli in simple intermittent systems. For this question, large deviations theory [8, 18] or 
renormalization techniques [9] may be relevant. 

Invariant densities represent the most regular type of invariant measures. In most dissiptive systems, invariant 
measures are supported on attractors of zero measure. Among such singular invariant measures, the best-behaved are 
SRB measures: they represent the "nicest" invariant measures one can hope to have in dissipative chaotic dynamics. It 
is natural ask the corresponding question of convergence rates for SRB measures, say the rate at which jjl^ converges 
to /isRB in the total variation norm. The answer, however, is not so apparent. It seems sensible to conjecture that, at 
least in uniformly hyperbolic systems, the rate of convergence may be determined by the regularity of the conditional 
densities of /isRB along unstable manifolds. 

Another set of open questions have to do with dimension. All the examples considered in this paper exist in 
low-dimensional spaces. Are there other factors which can affect the rate of convergence in higher dimensions which 
cannot be seen in low dimensions? 

The numerical calculations described in this paper employed a variety of methods. The convergence properties 
of these numerical methods is not yet completely understood and await deeper analysis. In particular, the extent to 
which a discrete transfer operator T captures the detailed spectral structure of T/ and the effect that this has on the 
convergence of numerical eigenproblem routines is far from understood, as indicated by the sensitivity of the computed 
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density for the Misiurewicz maps of §2.2 to changes in the parameters a and e. 

Finally, as mentioned earlier, the results described here may be relevant for numerical studies of dynamical systems 
with intermittent, metastable behavior Noise can help reduce initialization bias in long-time numerical simulations 
while introducing controllable errors. This may be particularly useful in intermittent systems of the type examined 
in Sections 3.1 and 3.2. A first step in this direction was made in [14], but a systematic exploration is required to 
understand these ideas. In particular, to carry out this idea in practice will require algorithms which can cope with the 
additional complexity of separatrices and multiple ergodic components. 
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